AITopics | policy uncertainty

policy uncertainty

On The Presence of Double-Descent in Deep Reinforcement Learning

Veselý, Viktor, Todorov, Aleksandar, Sabatelli, Matthia

arXiv.org Machine Learning, Nov-11-2025

The double descent (DD) paradox, where over-parameterized models see generalization improve past the interpolation point, remains largely unexplored in the non-stationary domain of Deep Reinforcement Learning (DRL). We present preliminary evidence that DD exists in model-free DRL, investigating it systematically across varying model capacity using the Actor-Critic framework. We rely on an information-theoretic metric, Policy Entropy, to measure policy uncertainty throughout training. Preliminary results show a clear epoch-wise DD curve; the policy's entrance into the second descent region correlates with a sustained, significant reduction in Policy Entropy. This entropic decay suggests that over-parameterization acts as an implicit regularizer, guiding the policy towards robust, flatter minima in the loss landscape. These findings establish DD as a factor in DRL and provide an information-based mechanism for designing agents that are more general, transferable, and robust.
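As a rough illustration of the Policy Entropy metric mentioned in the abstract, the sketch below computes the Shannon entropy of a discrete action distribution and averages it over a batch of states. The function names and the example distributions are assumptions made for illustration; the abstract does not specify how the authors compute or aggregate the metric.

```python
import math

def policy_entropy(action_probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    total = sum(action_probs)
    if not math.isclose(total, 1.0, abs_tol=1e-6):
        raise ValueError("action probabilities must sum to 1")
    # Terms with p == 0 contribute nothing (limit of p * log p as p -> 0).
    return -sum(p * math.log(p) for p in action_probs if p > 0.0)

def mean_policy_entropy(batch_of_action_probs):
    """Average entropy over a batch of states (assumed aggregation, e.g. per epoch)."""
    entropies = [policy_entropy(p) for p in batch_of_action_probs]
    return sum(entropies) / len(entropies)

uniform = [0.25, 0.25, 0.25, 0.25]  # maximally uncertain over 4 actions
peaked = [0.97, 0.01, 0.01, 0.01]   # nearly deterministic policy
print(policy_entropy(uniform), policy_entropy(peaked))
```

A sustained drop in this quantity over training would correspond to the policy committing to fewer actions per state, which is the kind of signal the abstract associates with the second descent region.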

artificial intelligence, deep reinforcement learning, machine learning, (15 more...)

arXiv:2511.06895

Genre: Research Report > New Finding (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Geopolitics, Geoeconomics and Risk: A Machine Learning Approach

Ortiz, Alvaro, Rodrigo, Tomasa

arXiv.org Machine Learning, Oct-15-2025

We introduce a novel high-frequency daily panel dataset of both market-based and news-based indicators -- including Geopolitical Risk, Economic Policy Uncertainty, Trade Policy Uncertainty, and Political Sentiment -- for 42 countries across both emerging and developed markets. Using this dataset, we study how sentiment dynamics shape sovereign risk, measured by Credit Default Swap (CDS) spreads, and evaluate their forecasting value relative to traditional drivers such as global monetary policy and market volatility. Our horse-race analysis of forecasting models demonstrates that incorporating news-based indicators significantly enhances predictive accuracy and enriches the analysis, with non-linear machine learning methods -- particularly Random Forests -- delivering the largest gains. Our analysis reveals that while global financial variables remain the dominant drivers of sovereign risk, geopolitical risk and economic policy uncertainty also play a meaningful role. Crucially, their effects are amplified through non-linear interactions with global financial conditions. Finally, we document pronounced regional heterogeneity, as certain asset classes and emerging markets exhibit heightened sensitivity to shocks in policy rates, global financial volatility, and geopolitical risk.

artificial intelligence, machine learning, sovereign risk, (16 more...)

arXiv:2510.12416

Country:

Asia > Russia (0.29)
Europe > Ukraine (0.15)
North America > Mexico (0.14)
(45 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Government > Foreign Policy (1.00)
Government > Commerce (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Economic Policy Uncertainty: A Review on Applications and Measurement Methods with Focus on Text Mining Methods

Kaveh-Yazdy, Fatemeh, Zarifzadeh, Sajjad

arXiv.org Artificial Intelligence, Aug-20-2023

Economic Policy Uncertainty (EPU) represents the uncertainty perceived by investors during changes in economic policy. EPU is a critical indicator in economic studies for predicting future investment, the unemployment rate, and recessions. EPU values can be estimated directly from financial parameters or indirectly, as implied uncertainty, using text mining methods. Although EPU is a well-studied topic in economics, the methods used to measure it are understudied. In this article, we briefly define EPU, review the methods used to measure it, and survey the areas influenced by changes in the EPU level. We divide EPU measurement methods into three major groups according to their input data. Examples of each group of methods are listed, and the pros and cons of the groups are discussed. Among EPU measures, text mining-based ones are the most widely studied. These methods measure the realized uncertainty by taking into account the uncertainty expressed in the news and in publicly available sources of financial information. Finally, we survey the research areas that rely on measuring the EPU index, in the hope that studying the impacts of uncertainty will attract further attention from researchers in various fields. In addition, we propose a list of future research directions focused on measuring EPU using textual material.
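To make the text mining-based family of measures more concrete, here is a minimal sketch of a newspaper-count style index: an article counts toward EPU only if it mentions terms from an economy set, a policy set, and an uncertainty set, and the per-period share of such articles is rescaled to a chosen mean. The term lists, function names, and the normalization to 100 are illustrative assumptions, not the method of any specific paper covered by the review.

```python
import re

# Hypothetical term sets; real indices use curated, language-specific lists.
ECONOMY = {"economy", "economic"}
POLICY = {"policy", "regulation", "legislation", "deficit", "central bank"}
UNCERTAINTY = {"uncertain", "uncertainty"}

def _mentions(text, terms):
    """True if any term occurs in the text as a whole word or phrase."""
    return any(re.search(r"\b" + re.escape(t) + r"\b", text) for t in terms)

def is_epu_article(text):
    """An article qualifies only if it touches all three categories."""
    lowered = text.lower()
    return all(_mentions(lowered, terms) for terms in (ECONOMY, POLICY, UNCERTAINTY))

def epu_share(articles):
    """Fraction of articles in one period that qualify as EPU-related."""
    if not articles:
        return 0.0
    return sum(is_epu_article(a) for a in articles) / len(articles)

def normalize_index(shares, base=100.0):
    """Rescale a series of per-period shares so that its mean equals `base`."""
    mean = sum(shares) / len(shares)
    if mean == 0:
        raise ValueError("cannot normalize an all-zero series")
    return [base * s / mean for s in shares]

period = [
    "Uncertainty over central bank policy weighs on the economy.",
    "The local team won the championship.",
]
print(epu_share(period))
```

Dividing by the total article count in each period keeps the measure comparable when the volume of news coverage changes over time.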

economic policy uncertainty, economics, policy uncertainty, (13 more...)

arXiv:2308.10304

Country:

North America > Mexico (0.14)
Asia > Macao (0.04)
Asia > Japan (0.04)
(25 more...)

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Banking & Finance > Economy (1.00)
Government > Regional Government > North America Government > United States Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Data Science > Data Mining > Text Mining (0.81)

PATO: Policy Assisted TeleOperation for Scalable Robot Data Collection

Dass, Shivin, Pertsch, Karl, Zhang, Hejia, Lee, Youngwoon, Lim, Joseph J., Nikolaidis, Stefanos

arXiv.org Artificial Intelligence, Jun-1-2023

Large-scale data is an essential component of machine learning as demonstrated in recent advances in natural language processing and computer vision research. However, collecting large-scale robotic data is much more expensive and slower as each operator can control only a single robot at a time. To make this costly data collection process efficient and scalable, we propose Policy Assisted TeleOperation (PATO), a system which automates part of the demonstration collection process using a learned assistive policy. PATO autonomously executes repetitive behaviors in data collection and asks for human input only when it is uncertain about which subtask or behavior to execute. We conduct teleoperation user studies both with a real robot and a simulated robot fleet and demonstrate that our assisted teleoperation system reduces human operators' mental load while improving data collection efficiency. Further, it enables a single operator to control multiple robots in parallel, which is a first step towards scalable robotic data collection. For code and video results, see https://clvrai.com/pato
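The idea of asking for human input only when the assistive policy is uncertain, across several robots at once, can be sketched as a simple triage rule. Everything below is an assumption for illustration: the abstract does not say how PATO quantifies uncertainty or schedules the operator, so this sketch uses the entropy of a hypothetical per-robot distribution over candidate subtasks and serves the most uncertain robot first.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def robots_needing_help(subtask_beliefs, threshold):
    """Return ids of robots whose subtask belief is too uncertain, most uncertain first.

    subtask_beliefs: dict mapping robot id -> probabilities over candidate subtasks.
    threshold: entropy level (hypothetical tuning parameter) above which help is requested.
    """
    scored = {rid: entropy(p) for rid, p in subtask_beliefs.items()}
    flagged = [rid for rid, h in scored.items() if h > threshold]
    return sorted(flagged, key=lambda rid: scored[rid], reverse=True)

beliefs = {
    "robot_a": [0.98, 0.01, 0.01],  # confident: keeps executing autonomously
    "robot_b": [0.40, 0.35, 0.25],  # unsure which subtask comes next
    "robot_c": [0.55, 0.44, 0.01],  # torn between two subtasks
}
queue = robots_needing_help(beliefs, threshold=0.5)
print(queue)
```

Robots that stay below the threshold never enter the queue, which is what would let a single operator supervise several robots in parallel.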

artificial intelligence, machine learning, robot, (18 more...)

arXiv:2212.04708

Country:

North America > United States (0.14)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.35)

How to Enable Uncertainty Estimation in Proximal Policy Optimization

Bykovets, Eugene, Metz, Yannick, El-Assady, Mennatallah, Keim, Daniel A., Buhmann, Joachim M.

arXiv.org Artificial Intelligence, Oct-7-2022

While deep reinforcement learning (RL) agents have showcased strong results across many domains, a major concern is their inherent opaqueness and the safety of such systems in real-world use cases. To overcome these issues, we need agents that can quantify their uncertainty and detect out-of-distribution (OOD) states. Existing uncertainty estimation techniques, like Monte-Carlo Dropout or Deep Ensembles, have not seen widespread adoption in on-policy deep RL. We posit that this is due to two reasons. First, concepts like uncertainty and OOD states are not as well defined as in supervised learning, especially for on-policy RL methods. Second, available implementations and comparative studies for uncertainty estimation methods in RL have been limited. To overcome the first gap, we propose definitions of uncertainty and OOD for Actor-Critic RL algorithms, namely, proximal policy optimization (PPO), and present possible applicable measures. In particular, we discuss the concepts of value and policy uncertainty. The second point is addressed by implementing different uncertainty estimation methods and comparing them across a number of environments. The OOD detection performance is evaluated via a custom evaluation benchmark of in-distribution (ID) and OOD states for various RL environments. We identify a trade-off between reward and OOD detection performance. To overcome this, we formulate a Pareto optimization problem in which we simultaneously optimize for reward and OOD detection performance. We show experimentally that the recently proposed method of Masksembles strikes a favourable balance among the surveyed methods, enabling high-quality uncertainty estimation and OOD detection while matching the performance of original RL agents.
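One simple way to picture value uncertainty and OOD detection is ensemble disagreement: several critics score the same state, and a large spread among their estimates flags the state as unfamiliar. The sketch below uses a plain ensemble with a quantile threshold calibrated on in-distribution states. The function names, the threshold rule, and the numbers are assumptions for illustration; this is not Masksembles and not the paper's benchmark.

```python
import statistics

def value_uncertainty(ensemble_values):
    """Disagreement among ensemble members' value estimates for one state."""
    return statistics.pstdev(ensemble_values)

def calibrate_threshold(id_uncertainties, quantile=0.95):
    """Pick a threshold so that roughly `quantile` of ID states fall at or below it."""
    ordered = sorted(id_uncertainties)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index]

def is_ood(ensemble_values, threshold):
    """Flag a state as OOD when ensemble disagreement exceeds the threshold."""
    return value_uncertainty(ensemble_values) > threshold

# Hypothetical critic outputs (4 ensemble members) on in-distribution states.
id_values = [
    [1.00, 1.02, 0.98, 1.01],
    [0.50, 0.49, 0.52, 0.50],
    [2.00, 2.03, 1.97, 2.00],
]
threshold = calibrate_threshold([value_uncertainty(v) for v in id_values])

novel_state_values = [0.2, 1.5, -0.4, 0.9]  # members disagree strongly
print(is_ood(novel_state_values, threshold), is_ood(id_values[0], threshold))
```

The same pattern could be applied to policy uncertainty by measuring disagreement among ensemble members' action distributions instead of their value estimates.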

machine learning, masksemble, reinforcement learning, (14 more...)

arXiv:2210.03649

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > Switzerland > Zürich > Zürich (0.04)
North America > Puerto Rico > San Juan > San Juan (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Multiplicative Controller Fusion: A Hybrid Navigation Strategy For Deployment in Unknown Environments

Rana, Krishan, Dasagi, Vibhavari, Talbot, Ben, Milford, Michael, Sünderhauf, Niko

arXiv.org Artificial Intelligence, Mar-13-2020

Learning-based approaches often outperform hand-coded algorithmic solutions for many problems in robotics. However, learning long-horizon tasks on real robot hardware can be intractable, and transferring a learned policy from simulation to reality is still extremely challenging. We present a novel approach to model-free reinforcement learning that can leverage existing sub-optimal solutions as an algorithmic prior during training and deployment. During training, our gated fusion approach enables the prior to guide the initial stages of exploration, increasing sample-efficiency and enabling learning from sparse long-horizon reward signals. Importantly, the policy can learn to improve beyond the performance of the sub-optimal prior since the prior's influence is annealed gradually. During deployment, the policy's uncertainty provides a reliable strategy for transferring a simulation-trained policy to the real world by falling back to the prior controller in uncertain states. We show the efficacy of our Multiplicative Controller Fusion approach on the task of robot navigation and demonstrate safe transfer from simulation to the real world without any fine tuning. The code for this project is made publicly available at https://sites.google.com/view/mcf-nav/home.
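One way to read "multiplicative" fusion with an uncertainty-based fallback is as a product of two Gaussian action distributions, one from the learned policy and one from the prior controller. The abstract does not give the formula, so the sketch below is an assumption: it uses the standard result that the renormalized product of two 1-D Gaussians is a Gaussian with a precision-weighted mean, which makes the less certain source contribute less.

```python
def fuse_gaussians(mean_policy, std_policy, mean_prior, std_prior):
    """Renormalized product of two 1-D Gaussians: precision-weighted mean and std."""
    prec_policy = 1.0 / std_policy ** 2
    prec_prior = 1.0 / std_prior ** 2
    fused_var = 1.0 / (prec_policy + prec_prior)
    fused_mean = fused_var * (prec_policy * mean_policy + prec_prior * mean_prior)
    return fused_mean, fused_var ** 0.5

# Hypothetical steering commands: policy says 1.0, prior controller says 3.0.
confident_policy = fuse_gaussians(1.0, 0.1, 3.0, 1.0)  # fused mean stays near the policy
uncertain_policy = fuse_gaussians(1.0, 5.0, 3.0, 0.5)  # fused mean falls back toward the prior
print(confident_policy, uncertain_policy)
```

Under this reading, no explicit switch is needed at deployment: when the policy's predicted standard deviation grows in unfamiliar states, the fused action drifts toward the prior controller on its own.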

controller, deployment, exploration, (16 more...)

arXiv:2003.05117

Country:

Oceania > Australia > Queensland > Brisbane (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Generalized Thompson Sampling for Sequential Decision-Making and Causal Inference

Ortega, Pedro A., Braun, Daniel A.

arXiv.org Artificial Intelligence, Mar-18-2013

Recently, it has been shown how sampling actions from the predictive distribution over the optimal action (sometimes called Thompson sampling) can be applied to solve sequential adaptive control problems when the optimal policy is known for each possible environment. The predictive distribution can then be constructed by a Bayesian superposition of the optimal policies, weighted by their posterior probability, which is updated by Bayesian inference and causal calculus. Here we discuss three important features of this approach. First, we discuss to what extent such Thompson sampling can be regarded as a natural consequence of the Bayesian modeling of policy uncertainty. Second, we show how Thompson sampling can be used to study interactions between multiple adaptive agents, thus opening up an avenue of game-theoretic analysis. Third, we show how Thompson sampling can be applied to infer causal relationships when interacting with an environment in a sequential fashion. In summary, our results suggest that Thompson sampling might not merely be a useful heuristic, but a principled method to address problems of adaptive sequential decision-making and causal inference.
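The sampling scheme described in the abstract can be sketched for a toy case with two candidate environments: draw one environment from the current posterior, play the action that is optimal for it, then update the posterior with the likelihood of the observed reward given the chosen action. The two hypotheses, their payoff probabilities, and all names below are illustrative assumptions; the paper's general formulation is not reproduced here.

```python
import random

# Hypothetical two-armed world: each hypothesis fixes the payoff probability per arm.
PAYOFF = {
    "left_is_good": {0: 0.8, 1: 0.2},
    "right_is_good": {0: 0.2, 1: 0.8},
}
# The optimal policy is assumed known for each hypothesis.
OPTIMAL_ACTION = {"left_is_good": 0, "right_is_good": 1}

def sample_action(posterior, rng):
    """Draw one environment from the posterior and act optimally for it."""
    envs = list(posterior)
    env = rng.choices(envs, weights=[posterior[e] for e in envs], k=1)[0]
    return OPTIMAL_ACTION[env]

def update(posterior, action, reward):
    """Bayes update using the likelihood of the reward given the chosen action."""
    unnorm = {}
    for env, weight in posterior.items():
        p = PAYOFF[env][action]
        unnorm[env] = weight * (p if reward == 1 else 1.0 - p)
    z = sum(unnorm.values())
    return {env: w / z for env, w in unnorm.items()}

def run(true_env, steps, seed=0):
    """Interact with the true environment, returning the final posterior and actions."""
    rng = random.Random(seed)
    posterior = {env: 1.0 / len(PAYOFF) for env in PAYOFF}
    actions = []
    for _ in range(steps):
        action = sample_action(posterior, rng)
        reward = 1 if rng.random() < PAYOFF[true_env][action] else 0
        posterior = update(posterior, action, reward)
        actions.append(action)
    return posterior, actions

posterior, actions = run("right_is_good", steps=200)
print(posterior)
```

Because actions are sampled in proportion to posterior belief, exploration fades on its own as evidence accumulates, with no separate exploration schedule.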

artificial intelligence, machine learning, thompson, (17 more...)

doi: 10.1186/2194-3206-2-2

arXiv:1303.4431

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
